Overview

Dataset Statistics

Number of Variables 10
Number of Rows 20640
Missing Cells 207
Missing Cells (%) 0.1%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 2.7 MB
Average Row Size in Memory 137.1 B
Variable Types
  • Numerical: 9
  • Categorical: 1
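
The two missing-value percentages in this report (0.1% here, 1.0% under total_bedrooms) use different denominators. The following is a minimal sketch of that arithmetic, using only figures copied from this report. The variable names are illustrative, and how the profiling tool computes these internally is not stated.

```python
# Figures copied from the Dataset Statistics block.
n_rows = 20640
n_vars = 10
missing_cells = 207

# Dataset-level percentage: missing cells over all cells (rows x variables).
total_cells = n_rows * n_vars
missing_pct = 100 * missing_cells / total_cells

# Column-level percentage: the same 207 cells over the rows of one column.
# All 207 missing cells are reported under total_bedrooms.
column_missing_pct = 100 * missing_cells / n_rows

print(f"{missing_pct:.1f}% of all cells, {column_missing_pct:.1f}% of one column")
```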

Dataset Insights

total_bedrooms and households have similar distributions
longitude is skewed
latitude is skewed
total_rooms is skewed
total_bedrooms is skewed
population is skewed
households is skewed
longitude has 20640 (100.0%) negatives
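
The skewness values reported per variable as γ1 can be read with the help of the sketch below, which computes the population (Fisher-Pearson) skewness in plain Python. This is an assumption about the estimator: the report states neither which skewness formula it uses nor what rule it applies to flag a column as skewed. The function name and sample data are hypothetical.

```python
from statistics import fmean

def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (central moments m2, m3)."""
    mean = fmean(values)
    m2 = fmean([(x - mean) ** 2 for x in values])
    m3 = fmean([(x - mean) ** 3 for x in values])
    return m3 / m2 ** 1.5

# Hypothetical samples, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5]
long_right_tail = [1, 2, 2, 3, 3, 3, 4, 20]
long_left_tail = [-x for x in long_right_tail]

print(skewness(symmetric))        # zero for a symmetric sample
print(skewness(long_right_tail))  # positive: "skewed right"
print(skewness(long_left_tail))   # negative: "skewed left"
```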

Variables


longitude

numerical

Approximate Distinct Count 844
Approximate Unique (%) 4.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 322.5 KB
Mean -119.5697
Minimum -124.35
Maximum -114.31
Zeros 0
Zeros (%) 0.0%
Negatives 20640
Negatives (%) 100.0%
  • longitude is skewed left (γ1 = -0.2978)

Quantile Statistics

Minimum -124.35
5-th Percentile -122.47
Q1 -121.8
Median -118.49
Q3 -118.01
95-th Percentile -117.08
Maximum -114.31
Range 10.04
IQR 3.79

Descriptive Statistics

Mean -119.5697
Standard Deviation 2.0035
Variance 4.0141
Sum -2.4679e+06
Skewness -0.2978
Kurtosis -1.3301
Coefficient of Variation -0.01676
  • longitude is not normally distributed (p-value 4.000582473597246e-07)
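
Several of the figures above are derived from others, which gives a quick consistency check. The sketch below recomputes them from the rounded values printed for longitude. Small differences in the last digit are expected because the report presumably works from unrounded inputs.

```python
# Values copied from the longitude section.
minimum, maximum = -124.35, -114.31
q1, q3 = -121.8, -118.01
mean, std = -119.5697, 2.0035

value_range = maximum - minimum  # reported Range: 10.04
iqr = q3 - q1                    # reported IQR: 3.79
variance = std ** 2              # reported Variance: 4.0141
cv = std / mean                  # reported Coefficient of Variation: -0.01676

print(round(value_range, 2), round(iqr, 2), round(variance, 3), round(cv, 5))
```

The coefficient of variation is negative only because the mean is negative. Its magnitude is what describes relative spread.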

latitude

numerical

Approximate Distinct Count 862
Approximate Unique (%) 4.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 322.5 KB
Mean 35.6319
Minimum 32.54
Maximum 41.95
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • latitude is skewed right (γ1 = 0.4659)

Quantile Statistics

Minimum 32.54
5-th Percentile 32.82
Q1 33.93
Median 34.26
Q3 37.71
95-th Percentile 38.96
Maximum 41.95
Range 9.41
IQR 3.78

Descriptive Statistics

Mean 35.6319
Standard Deviation 2.136
Variance 4.5623
Sum 735441.62
Skewness 0.4659
Kurtosis -1.1178
Coefficient of Variation 0.05995
  • latitude is not normally distributed (p-value 4.733293899569894e-12)

housing_median_age

numerical

Approximate Distinct Count 52
Approximate Unique (%) 0.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 322.5 KB
Mean 28.6395
Minimum 1
Maximum 52
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • housing_median_age is skewed right (γ1 = 0.0603)

Quantile Statistics

Minimum 1
5-th Percentile 8
Q1 18
Median 29
Q3 37
95-th Percentile 52
Maximum 52
Range 51
IQR 19

Descriptive Statistics

Mean 28.6395
Standard Deviation 12.5856
Variance 158.3963
Sum 591119
Skewness 0.06033
Kurtosis -0.8007
Coefficient of Variation 0.4394
  • housing_median_age is not normally distributed (p-value 3.198980878617318e-05)

total_rooms

numerical

Approximate Distinct Count 5926
Approximate Unique (%) 28.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 322.5 KB
Mean 2635.7631
Minimum 2
Maximum 39320
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • total_rooms is skewed right (γ1 = 4.147)

Quantile Statistics

Minimum 2
5-th Percentile 620.95
Q1 1447.75
Median 2127
Q3 3148
95-th Percentile 6213.2
Maximum 39320
Range 39318
IQR 1700.25

Descriptive Statistics

Mean 2635.7631
Standard Deviation 2181.6153
Variance 4.7594e+06
Sum 5.4402e+07
Skewness 4.147
Kurtosis 32.6227
Coefficient of Variation 0.8277
  • total_rooms is not normally distributed (p-value 9.48597437457448e-14)
  • total_rooms has 1287 outliers
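
The report does not state which rule it uses to count outliers. A common convention is Tukey's fences, which flag values more than 1.5 × IQR outside the quartiles. The sketch below applies that convention to the total_rooms quartiles as an assumption. It is not claimed to be the report's own method, and `tukey_fences` is a hypothetical helper.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles copied from the total_rooms Quantile Statistics.
lower, upper = tukey_fences(1447.75, 3148)
print(lower, upper)

# Hypothetical values, only to show how the fences would be applied.
sample = [2, 1500, 2127, 5698, 6000, 39320]
flagged = [x for x in sample if x < lower or x > upper]
print(flagged)
```

Under this convention the lower fence is negative, so with a minimum of 2 only unusually large values could be flagged for this column.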

total_bedrooms

numerical

Approximate Distinct Count 1923
Approximate Unique (%) 9.4%
Missing 207
Missing (%) 1.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 319.3 KB
Mean 537.8706
Minimum 1
Maximum 6445
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • total_bedrooms is skewed right (γ1 = 3.4593)

Quantile Statistics

Minimum 1
5-th Percentile 137
Q1 296
Median 435
Q3 647
95-th Percentile 1275.4
Maximum 6445
Range 6444
IQR 351

Descriptive Statistics

Mean 537.8706
Standard Deviation 421.3851
Variance 177565.3773
Sum 1.099e+07
Skewness 3.4593
Kurtosis 21.9799
Coefficient of Variation 0.7834
  • total_bedrooms is not normally distributed (p-value 2.1687944946982806e-12)
  • total_bedrooms has 1271 outliers
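
total_bedrooms is the only column with missing values, which accounts for its smaller memory size and for a sum computed over fewer rows. The sketch below checks both from the reported figures. The 16 bytes per stored value is inferred from the arithmetic (322.5 KB over 20640 rows) and is not stated in the report.

```python
n_rows = 20640
n_missing = 207
n_present = n_rows - n_missing  # 20433 non-missing values

# Inferred, not documented: 322.5 KB / 20640 rows = 16 bytes per value.
BYTES_PER_VALUE = 16
full_column_kb = n_rows * BYTES_PER_VALUE / 1024
bedrooms_kb = n_present * BYTES_PER_VALUE / 1024

# The reported mean times the non-missing count should approximate the sum.
reported_mean = 537.8706
approx_sum = reported_mean * n_present

print(round(full_column_kb, 1), round(bedrooms_kb, 1), f"{approx_sum:.3e}")
```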

population

numerical

Approximate Distinct Count 3888
Approximate Unique (%) 18.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 322.5 KB
Mean 1425.4767
Minimum 3
Maximum 35682
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • population is skewed right (γ1 = 4.9355)

Quantile Statistics

Minimum 3
5-th Percentile 348
Q1 787
Median 1166
Q3 1725
95-th Percentile 3288
Maximum 35682
Range 35679
IQR 938

Descriptive Statistics

Mean 1425.4767
Standard Deviation 1132.4621
Variance 1.2825e+06
Sum 2.9422e+07
Skewness 4.9355
Kurtosis 73.535
Coefficient of Variation 0.7944
  • population is not normally distributed (p-value 3.2126712720360756e-18)
  • population has 1196 outliers

households

numerical

Approximate Distinct Count 1815
Approximate Unique (%) 8.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 322.5 KB
Mean 499.5397
Minimum 1
Maximum 6082
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • households is skewed right (γ1 = 3.4102)

Quantile Statistics

Minimum 1
5-th Percentile 125
Q1 280
Median 409
Q3 605
95-th Percentile 1162
Maximum 6082
Range 6081
IQR 325

Descriptive Statistics

Mean 499.5397
Standard Deviation 382.3298
Variance 146176.0399
Sum 1.031e+07
Skewness 3.4102
Kurtosis 22.0524
Coefficient of Variation 0.7654
  • households is not normally distributed (p-value 2.432590470464298e-12)
  • households has 1220 outliers

median_income

numerical

Approximate Distinct Count 12928
Approximate Unique (%) 62.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 322.5 KB
Mean 3.8707
Minimum 0.4999
Maximum 15.0001
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • median_income is skewed right (γ1 = 1.6465)

Quantile Statistics

Minimum 0.4999
5-th Percentile 1.6006
Q1 2.5634
Median 3.5348
Q3 4.7432
95-th Percentile 7.3003
Maximum 15.0001
Range 14.5002
IQR 2.1799

Descriptive Statistics

Mean 3.8707
Standard Deviation 1.8998
Variance 3.6093
Sum 79890.6495
Skewness 1.6465
Kurtosis 4.951
Coefficient of Variation 0.4908
  • median_income is not normally distributed (p-value 0.00791974194206781)
  • median_income has 681 outliers

median_house_value

numerical

Approximate Distinct Count 3842
Approximate Unique (%) 18.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 322.5 KB
Mean 206855.8169
Minimum 14999
Maximum 500001
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • median_house_value is skewed right (γ1 = 0.9777)

Quantile Statistics

Minimum 14999
5-th Percentile 66200
Q1 119600
Median 179700
Q3 264725
95-th Percentile 489810
Maximum 500001
Range 485002
IQR 145125

Descriptive Statistics

Mean 206855.8169
Standard Deviation 115395.6159
Variance 1.3316e+10
Sum 4.2695e+09
Skewness 0.9777
Kurtosis 0.3275
Coefficient of Variation 0.5579
  • median_house_value is not normally distributed (p-value 0.0002598425332835302)
  • median_house_value has 1071 outliers

ocean_proximity

categorical

Approximate Distinct Count 5
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 1.4 MB

Length

Mean 8.0649
Standard Deviation 1.4914
Median 9
Minimum 6
Maximum 10

Sample

1st row NEAR BAY
2nd row NEAR BAY
3rd row NEAR BAY
4th row NEAR BAY
5th row NEAR BAY

Letter

Count 134104
Lowercase Letter 0
Space Separator 14084
Uppercase Letter 134104
Dash Punctuation 0
Decimal Number 9136
  • The top 2 categories (<1H OCEAN, INLAND) take over 50.0%
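
A statement such as "the top 2 categories take over 50%" can be checked by counting labels and summing the largest shares. The sketch below shows one way to do that. The sample list uses labels from this report but hypothetical counts, since the per-category counts are not listed here, and `top_k_share` is an illustrative helper.

```python
from collections import Counter

def top_k_share(values, k=2):
    """Return the k most frequent labels and their combined share of all values."""
    counts = Counter(values)
    top = counts.most_common(k)
    covered = sum(n for _, n in top)
    return [label for label, _ in top], covered / len(values)

# Hypothetical counts, for illustration only.
sample = ["<1H OCEAN"] * 5 + ["INLAND"] * 3 + ["NEAR BAY"] * 2
labels, share = top_k_share(sample)
print(labels, f"{share:.1%}")
```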

Interactions

Correlations

Missing Values